TO ALL TO WHOM THESE PRESENTS SHALL COME:
UNITED STATES DEPARTMENT OF COMMERCE 
United States Patent and Trademark Office

December 04, 2000 

THIS IS TO CERTIFY THAT ANNEXED HERETO IS A TRUE COPY FROM 
THE RECORDS OF THE UNITED STATES PATENT AND TRADEMARK 
OFFICE OF THOSE PAPERS OF THE BELOW IDENTIFIED PATENT 
APPLICATION THAT MET THE REQUIREMENTS TO BE GRANTED A 
FILING DATE UNDER 35 USC 111. 

APPLICATION NUMBER: 09/559,352 
FILING DATE: April 27, 2000 

PRIORITY DOCUMENT 

RULE 17.1(a) OR (b) 




By Authority of the 
COMMISSIONER OF PATENTS AND TRADEMARKS



T. LAWRENCE 
Certifying Officer 






UTILITY PATENT APPLICATION TRANSMITTAL

Submit an original and a duplicate for fee processing 
(Only for new nonprovisional applications under 37 CFR 1.53(b)) 



Address to: 

Assistant Commissioner for Patents 
Box Patent Application 
Washington, D.C. 20231 



APPLICATION ELEMENTS 



1. ☒ Transmittal Form with Fee

2. ☒ Specification (including claims and abstract) [Total Pages 22]

3. ☒ Drawings [Total Sheets 2]

4. ☒ Oath or Declaration [Total Pages 3]

   a. ☒ Newly executed

   b. □ Copy from prior application [Note Boxes 5 and 17 below]

      i. □ Deletion of Inventor(s): Signed statement attached deleting
         inventor(s) named in the prior application

5. □ Incorporation by Reference: The entire
   disclosure of the prior application, from which a
   copy of the oath or declaration is supplied under
   Box 4b, is considered as being part of the
   disclosure of the accompanying application and is
   hereby incorporated by reference therein.

6. □ Microfiche Computer Program

7. □ Nucleotide and/or Amino Acid Sequence Submission

   a. □ Computer Readable Copy

   b. □ Paper Copy

   c. □ Statement verifying above copies



Attorney Docket No.:   MBHB00-353
First Named Inventor:  Michael Kagan
Express Mail No.:      EL442908923US
Total Pages:           43



ACCOMPANYING APPLICATION PARTS



8. ☒ Assignment Papers

9. ☒ Power of Attorney

10. □ English Translation Document (if applicable)

11. □ Information Disclosure Statement (IDS)
    □ PTO-1449 Form
    □ Copies of IDS Citations

12. □ Preliminary Amendment

13. Return Receipt Postcard
    (Should be specifically itemized)

14. ☒ Small Entity Statement(s)
    ☒ Enclosed
    □ Statement filed in prior application;
      status still proper and desired

15. □ Certified Copy of Priority Document(s)

16. ☒ Other: Transmittal Letter/Submission of Formal Drawings
    Certificate of Express Mailing



17. If a CONTINUING APPLICATION, check appropriate box and supply the requisite information:
□ Continuation □ Divisional □ Continuation-in-part of prior application Serial No. 






APPLICATION FEES

BASIC FEE                                                           $690.00

CLAIMS                NUMBER FILED    NUMBER EXTRA    RATE
Total Claims          27 - 20 =       7               x $18.00      $126.00
Independent Claims    3 - 3 =         0               x $78.00      $
□ Multiple Dependent Claim(s) if applicable           + $270.00     $

Total of above calculations =                                       $
Reduction by 50% for filing by small entity =                       $(408.00)
Assignment fee if applicable                          + $40.00      $40.00
TOTAL =                                                             $448.00



[Page 1 of 2] 



UTILITY PATENT APPLICATION TRANSMITTAL | Attorney Docket No. MBHB00-353



18. □ Please charge my Deposit Account No. 13-2490 in the amount of $ 

19. ☒ A check in the amount of $448.00 is enclosed.

20. The Commissioner is hereby authorized to credit overpayments or charge any additional fees of
the following types to Deposit Account No. 13-2490:

a. ☒ Fees required under 37 CFR 1.16.

b. ☒ Fees required under 37 CFR 1.17.

c. ☒ Fees required under 37 CFR 1.18.

21. □ The Commissioner is hereby generally authorized under 37 CFR 1.136(a)(3) to treat any
future reply in this or any related application filed pursuant to 37 CFR 1.53 requiring an
extension of time as incorporating a request therefor, and the Commissioner is hereby
specifically authorized to charge Deposit Account No. 13-2490 for any fee that may be due
in connection with such a request for an extension of time.


22. CORRESPONDENCE ADDRESS

Name:              McDonnell Boehnen Hulbert & Berghoff
Address:           32nd Floor, 300 South Wacker Drive
City, State, Zip:  Chicago, Illinois 60606

23. SIGNATURE OF APPLICANT, ATTORNEY, OR AGENT REQUIRED

Name:       Amir N. Penn, Reg. No. 40,767
Signature:
Date:       April 27, 2000



UTILITY (Rev. 11/18/97) 



[Page 2 of 2] 



CERTIFICATE OF MAILING

(PATENT)
Express Mail No. EL442908923US

Deposited April 27, 2000

I hereby certify that the attached correspondence, identified below, is being deposited 
with the United States Postal Service as "Express Mail Post Office to Addressee" under 37 CFR 
1.10 on the date indicated above and is addressed to the Assistant Commissioner of Patents, 
Washington, D.C. 20231. 





In Application for Patent of Michael Kagan, Diego Crupnicoff, Freddy Gabbay and Shimon
Rottenberg

Title: SYNCHRONIZATION OF INTERRUPTS WITH DATA PACKETS

X Patent Application (including cover sheet, 22 pages of
  specification and 2 pages of drawings)

X Utility Patent Cover Sheet

X Declaration and Power of Attorney

X Assignment

X Verified Statement Claiming Small Entity Status

X Transmittal/Submission of Formal Drawings

X Return Receipt Postcard

X Check in the amount of $448.00



Case No. MBHB00-353 



36895S1 



SYNCHRONIZATION OF INTERRUPTS WITH DATA PACKETS 

FIELD OF THE INVENTION 

The present invention relates generally to computing 
systems, and specifically to systems that use packet- 
switching fabrics to connect a computer host to 
peripheral devices.

BACKGROUND OF THE INVENTION 

In current-generation computers, the central 
processing unit (CPU) is connected to the system memory 
and to peripheral devices by a parallel bus, such as the 
ubiquitous Peripheral Component Interconnect (PCI) bus. As
data path-widths grow, and clock speeds become faster,
however, the parallel bus is becoming too costly and
complex to keep up with system demands. In response, the
computer industry is moving toward fast, packetized,
serial input/output (I/O) bus architectures, in which
computing hosts and peripherals are linked by a switching
network, commonly referred to as a switching fabric. A
number of architectures of this type have been proposed,
including "Next Generation I/O" (NGIO) and "Future I/O"
(FIO), culminating in the "InfiniBand" architecture,
which has been advanced by a consortium led by a group of
industry leaders (including Intel, Sun, Hewlett Packard,
IBM, Compaq, Dell and Microsoft). Storage Area Networks
(SAN) provide a similar, packetized, serial approach to 
high-speed storage access, which can also be implemented 
using an InfiniBand fabric. 

In a parallel bus-based computer system, when a 
peripheral device needs to deliver data to the CPU, it 
typically writes the data to the memory over the bus, 
using direct memory access. When the peripheral has 
finished writing, it asserts an interrupt to the CPU on 
one of the interrupt lines of the bus. Bus arbitration
ensures that the CPU will not attempt to read the data
from the memory until the writing of the data is 
complete. On the other hand, when the peripheral device 
and the CPU are connected by a packet-switching fabric, 
such as an InfiniBand fabric, they operate 
asynchronously. Furthermore, the data sent to the memory 
and the interrupt to the CPU travel over different paths, 
or channels. Typically, a separate line or channel is 
provided to connect the interrupt pin of the peripheral 
device to an interrupt controller of the CPU, bypassing 
the switching fabric. Therefore, there is no a priori 
assurance that all of the data will have been written to 
the memory before the CPU begins reading. 

The "race" between the interrupt path and the data 
path can result in errors (as when a CPU read stalls the 
data). Care must therefore be taken to synchronize data
and interrupt handling and to make sure that the data 
have been completely written to the memory before the CPU 
attempts to read it. 

A common solution in this situation is to program 
the CPU to access the peripheral device before accessing 
the memory, typically by performing a "configuration
read" from the peripheral device. In this mode of
operation, after the peripheral device has asserted the
interrupt to the CPU (indicating that the last item of
data has been sent to the memory), the CPU issues a read
request through the switching fabric, to read an
interrupt cause register in the peripheral device. The
peripheral device responds to the read request by sending
a packet containing the interrupt cause to the CPU over
the same channel as it used to send the data to the
memory. Since packets are ordered within a channel, the
response to the configuration read arrives at the CPU after
all of the previous writes have been flushed to memory.
The CPU begins to read the data from the memory only
after it has received the interrupt cause packet back
from the peripheral device. The configuration read thus
serves two crucial purposes: it provides the CPU with the
cause information that it needs in order to serve the 
interrupt, and it ensures that the CPU reads the memory 
only after all of the data have been written there. 
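
The ordering argument behind the configuration read can be illustrated with a minimal sketch. It assumes a single in-order channel modelled as a FIFO queue; the names Channel and config_read_barrier and the cause value "DATA_READY" are hypothetical and serve only as illustration.

```python
from collections import deque


class Channel:
    """One ordered channel through the fabric: packets leave in arrival order."""

    def __init__(self):
        self._queue = deque()

    def send(self, packet):
        self._queue.append(packet)

    def deliver_next(self):
        return self._queue.popleft()


def config_read_barrier(data_words):
    """Model the prior-art scheme: the read response trails the data writes."""
    channel = Channel()
    memory = []

    # The peripheral posts its writes toward the memory.
    for word in data_words:
        channel.send(("write", word))

    # The CPU's configuration read is answered on the same channel,
    # so the response is queued behind every earlier write.
    channel.send(("cause", "DATA_READY"))

    # The host side drains the channel in order; by the time the cause
    # response is seen, every preceding write has reached memory.
    while True:
        kind, payload = channel.deliver_next()
        if kind == "write":
            memory.append(payload)
        else:
            return payload, memory
```

Calling config_read_barrier([1, 2, 3]) returns the cause together with a memory image that already holds all three words, which is the guarantee the CPU relies on before it starts reading.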

This scheme has a number of serious performance
drawbacks, however. Every interrupt sent by the
peripheral device necessitates an additional exchange of
messages through the switching fabric between the CPU and
peripheral device. The exchange adds substantial latency
- typically 10 microseconds or more - every time the CPU
must service an interrupt. Furthermore, since
configuration reads are used as synchronization barriers,
the CPU is stalled from the moment the configuration read
request is issued until its response has arrived.
Valuable CPU time is therefore wasted waiting for the
interrupt cause to be retrieved.

U.S. Patent 5,689,713, whose disclosure is 
incorporated herein by reference, describes a method for 
interrupt request handling in a packet-switched computer 
system. The system may include a number of interrupt 
sources, which direct interrupts to any of a number of 
interrupt handlers. A system controller acts as an 
intermediary between interrupting devices and 
"interruptees." It includes an interrupt queue coupled
to each interrupt source for receiving multiple interrupt 
requests, and an output queue coupled to each interrupt 
handler. The controller thus enables asynchronous data 
from multiple sources to be conveyed across a packet- 
switched interconnection, while providing a dedicated 
channel for interrupts associated with the data packets. 

SUMMARY OF THE INVENTION

It is an object of the present invention to provide 
an improved method and system for passing data packets 
and associated interrupts through a switching fabric. 

It is a further object of some aspects of the 
present invention to provide a method and system for 
communication between a CPU and peripheral devices via a 
switching fabric that ensures proper synchronization 
between data and interrupts transmitted over the fabric. 

It is still a further object of some aspects of the 
present invention to provide a method and system for 
communication between a CPU and peripheral devices via a 
switching fabric that reduces latency and processing time 
required for servicing of interrupts by the CPU. 

In preferred embodiments of the present invention, a 
CPU and a peripheral device are linked to a packet- 
switching fabric by respective host and target network 
interfaces. The target interface receives data over a 
local bus from the peripheral device, for transmission in 
the form of packets to a system memory associated with 
the CPU. After sending the data, the peripheral device 
asserts an interrupt. The interrupt from the device is 
connected to an interrupt input of the target interface, 
rather than directly to the CPU or to a central system 
controller, as in systems known in the art. In response 
to the interrupt, the target interface reads the 
interrupt cause from the peripheral device, and then 
sends a special interrupt packet, including the interrupt 
cause, to the host interface. Preferably, the target 
interface sends the interrupt packet on the same channel 
as it sent the data packets, i.e., over the same "virtual 
lane," or route, and with the same priority as the data 
packets. It thus assures that the host interface will 
receive the interrupt packet only after it has received
all of the preceding data packets. 

Upon receiving the interrupt packet, the host 
interface places the interrupt cause in a predefined 
register in the memory. An interrupt signal is then sent 
from the host interface to an interrupt input of the CPU. 
Upon receiving the signal, the CPU checks to ensure that 
the host interface has finished writing all of the data 
from the peripheral device to the memory. This check 
serves a similar purpose to the configuration read 
described in the Background of the Invention. Only after 
completing the check does the CPU read the interrupt 
cause and begin processing the data in the memory. The 
CPU performs all of these steps locally, communicating 
with the host interface and memory over a local system 
bus, with latency on the order of nanoseconds, rather 
than having to exchange messages with the peripheral 
device through the switching fabric, taking many 
microseconds. As a result, interrupt response latency is 
minimized, and the CPU does not waste precious time and 
resources waiting for the configuration read response. 

In preferred embodiments of the present invention, 
the switching fabric comprises an InfiniBand network, and 
the host and target interfaces respectively comprise host 
and target channel adapters. It will be appreciated, 
however, that the principles of the present invention may 
similarly be applied to transmission of interrupts 
through substantially any packet-switched network. 

There is therefore provided, in accordance with a 
preferred embodiment of the present invention, a method 
for conveying data over a packet-switching network, 
including: 

receiving data from a peripheral device for
transmission via the network to a memory associated with
a central processing unit (CPU);

receiving an interrupt signal from the peripheral
device associated with the data;

sending one or more data packets containing the data 
over the network to a host network interface serving the 
memory and the CPU; and 

sending an interrupt packet over the network to the 
host network interface, responsive to which an interrupt 
input of the CPU is asserted only after the one or more 
data packets have arrived at the host network interface.

Typically, receiving the data includes receiving
parallel data over a local bus from the peripheral
device. Additionally or alternatively, receiving the
data includes receiving data to be written to the memory
by direct memory access.

Preferably, sending the interrupt packet includes
reading a cause of the interrupt from the peripheral
device, and incorporating the cause in the interrupt
packet. Further preferably, the method includes

receiving the interrupt packet at the host network 
interface, and writing the cause to a predetermined 
address in the memory, to be read by the CPU after the 
interrupt input is asserted. 

In a preferred embodiment, sending the interrupt 
packet includes sending the interrupt packet after 
receiving an acknowledgment from the memory that the data 
have been written thereto. 

Preferably, sending the one or more data packets 
includes sending the data packets over a selected channel 
through the network, and sending the interrupt packet 
includes sending the interrupt packet over the selected 
channel following the data packets. 

Further preferably, the method includes:

receiving the data packets and the interrupt packet 
at the host network interface; 

conveying the data in the packets for delivery to 
the memory over a local bus coupling the host network 
interface to the memory and the CPU; and 

notifying the CPU when all of the data have been 
conveyed. 

Most preferably, conveying the data in the packets 
includes passing the data to a system controller on the 
bus, and notifying the CPU includes informing the CPU 
when an acknowledgment is received by the host network 
interface from the system controller, typically by 
asserting the interrupt input of the CPU after the 
acknowledgment from the system controller has been 
received. Additionally or alternatively, notifying the 
CPU includes asserting the interrupt input of the CPU 
responsive to receiving the interrupt packet at the host 
network interface.

There is also provided, in accordance with a 
preferred embodiment of the present invention, network 
interface apparatus, including: 

a target channel adapter, which is operative to 
receive data from a peripheral device for transmission 
via a packet-switching network to a memory associated 
with a central processing unit (CPU) and to send one or 
more data packets containing the data over the network to 
a host network interface serving the memory and the CPU; 
and 

a target interface processor, adapted to receive an 
interrupt signal from the peripheral device associated 
with the data, and to send an interrupt packet over the 
network to the host network interface, responsive to 
which an interrupt input of the CPU is asserted only 
after the one or more data packets have arrived at the
host network interface. 

There is further provided, in accordance with a 
preferred embodiment of the present invention, network 
interface apparatus, including: 

a host channel adapter, which is operative to 
receive data packets transmitted over a packet-switching 
network from a peripheral device, and to convey data from 
the packets for delivery to a memory associated with a 
CPU over a local bus that is coupled to the memory and 
the CPU, and further to receive an interrupt packet sent 
over the network responsive to an interrupt signal 
asserted by the peripheral device after sending the data 
to the network; and 

a host interface processor, adapted, responsive to 
the interrupt packet, to notify the CPU when all of the 
data have been conveyed to the local bus.

Preferably, the target and host channel adapters 
include InfiniBand adapters. 

The present invention will be more fully understood 
from the following detailed description of the preferred 
embodiments thereof, taken together with the drawings in 
which: 

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a block diagram that schematically 
illustrates a computing system based on a packet- 
switching fabric, in accordance with a preferred 
embodiment of the present invention; 

Fig. 2 is a flow chart that schematically 
illustrates a method for transmitting data from a 
peripheral device to a CPU in the system of Fig. 1, in 
accordance with a preferred embodiment of the present 
invention; and 

Fig. 3 is a flow chart that schematically 
illustrates a method for processing data received by the 
CPU in the system of Fig. 1, in accordance with a 
preferred embodiment of the present invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Fig. 1 is a block diagram that schematically 
illustrates a computing system 20 built around a 
switching fabric 26, in accordance with a preferred
embodiment of the present invention. The switching 
fabric preferably comprises an InfiniBand fabric, as 
described in the Background of the Invention, and some of 
the terms used hereinbelow are specific to the InfiniBand 
architecture. It will be understood, however, that the 
system architecture and methods of communication 
described herein are in no way limited to InfiniBand, and
that other switching fabrics, as are known in the art,
may be configured to handle and convey interrupts in a
similar manner.

A CPU 21 is coupled to communicate via a system bus
52 with a system controller 24 and a system memory 22, as
is known in the art. Typically (although not
necessarily), the CPU comprises an Intel Pentium
processor, and bus 52 is a proprietary bus used in
conjunction with this processor. System controller 24 is
coupled to a standard I/O bus 50, such as a PCI bus, for
the purpose of communicating with peripheral devices, 
such as I/O adapters of various types. One such 
peripheral device 25 is shown in Fig. 1 by way of 
example, but in practical applications, system 20 
typically comprises multiple peripheral devices and, 
possibly, multiple CPUs. Peripheral device 25 includes 
an interrupt output 48, which it asserts in order to gain
the attention of the CPU. In systems known in the art,
interrupt output 48 is connected directly to an interrupt
controller 38, such as an Intel 8259 device, which
actuates an appropriate interrupt input 27 of CPU 21 when
the interrupt is asserted. In system 20, however,
interrupt output 48 and input 27 are linked only through
fabric 26, as described hereinbelow. 

Bus 50 is coupled to fabric 26 by a host network
interface unit 28. This unit comprises a host channel
adapter (HCA) 32, which interfaces with bus 50 and
converts data between packet and parallel forms. 
Alternatively, the HCA may be designed to interface with 
system bus 52. A switch 30 links the HCA to one or more 
core switches in the fabric. Ordinarily, data in packets 
received by switch 30 from fabric 26 are passed through 
HCA 32 to bus 50. An exception is made, however, for 
management packets, which are packets that carry a 
special header identifying themselves as such and 
including a local identifier (LID) address of either
switch 30 or HCA 32. These packets contain control
instructions for the switch or HCA. They are placed in a
dedicated register of the switch or HCA, as appropriate,
which then attempts to decode the instructions and carry
them out. Typically, the processing capabilities of the
switch and HCA are very limited, and they are assisted by
a fabric service agent (FSA), as described below, in
dealing with at least some of these management packets.

A host interface unit controller 36 acts as the FSA 
in interface unit 28. The controller preferably
comprises a microprocessor with random access memory
(RAM) for software code and data, and communicates with HCA
32 and switch 30. Alternatively, the controller may
comprise a hard-wired hardware element or digital signal 
processor. When HCA 32 or switch 30 receives a 
management packet that it cannot decode, it passes the 
packet to the controller. The controller decodes the 
packet, preferably based on suitable software stored in 
its code RAM. It then takes whatever action is called 
for by the packet, such as giving appropriate 
instructions to HCA 32 or switch 30. When the HCA
receives an interrupt packet, as described below, the 
actions taken by controller 36 also include signaling 
interrupt controller 38 via an interrupt output of unit 
28, so as to actuate interrupt input 27 of CPU 21. 
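
The packet handling in host interface unit 28 can be sketched as a simple dispatch. The sketch below is illustrative only: the Packet fields, the opcode strings, and the set of opcodes that the HCA decodes by itself are assumptions, not details taken from the InfiniBand architecture or from this description.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    dest_lid: int            # local identifier (LID) the packet is addressed to
    is_management: bool      # set by the special management header
    opcode: str              # hypothetical opcode field
    payload: bytes = b""


class HostInterfaceUnit:
    """Hypothetical model of unit 28: HCA 32 assisted by controller 36 (FSA)."""

    # Assumed: the few management opcodes the HCA can decode without help.
    HCA_OPCODES = {"SET_PORT_STATE"}

    def __init__(self):
        self.bus_writes = []         # data passed through the HCA to bus 50
        self.hca_handled = []        # management packets decoded by the HCA
        self.fsa_handled = []        # management packets passed to the FSA
        self.interrupts_raised = 0   # signals sent toward interrupt controller 38

    def receive(self, pkt):
        if not pkt.is_management:
            # Ordinary data packets are passed through to the bus.
            self.bus_writes.append(pkt.payload)
            return "bus"
        if pkt.opcode in self.HCA_OPCODES:
            self.hca_handled.append(pkt.opcode)
            return "hca"
        # Anything the HCA cannot decode goes to the controller.
        return self._fsa_decode(pkt)

    def _fsa_decode(self, pkt):
        self.fsa_handled.append(pkt.opcode)
        if pkt.opcode == "INTERRUPT":
            # An interrupt packet leads the controller to signal the
            # interrupt controller, which actuates the CPU interrupt input.
            self.interrupts_raised += 1
        return "fsa"
```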

Although for simplicity, only a single interrupt 
line from unit 28 to controller 38 is shown in Fig. 1, 
the unit preferably comprises multiple interrupt lines. 
These lines can be actuated selectively by controller 36 
so as to send multiple, different interrupts to CPU 21 
depending on the content of interrupt packets received by 
the HCA. Alternatively or additionally, the different
interrupt lines may be used to signal other host devices
that are linked to bus 50.
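
Selecting an interrupt line from the content of an interrupt packet might look like the following sketch. The cause names, the CAUSE_TO_LINE table, and the packet layout are hypothetical; the description states only that the line is chosen according to the content of the interrupt packet.

```python
# Hypothetical mapping from interrupt cause to an interrupt line of unit 28.
CAUSE_TO_LINE = {"DATA_READY": 0, "DEVICE_ERROR": 1}


class InterruptOutputs:
    """Models the multiple interrupt lines driven by controller 36."""

    def __init__(self, num_lines):
        self.asserted = [False] * num_lines

    def actuate_for(self, interrupt_packet, default_line=0):
        # Pick the line from the packet's cause; unknown causes use the default.
        line = CAUSE_TO_LINE.get(interrupt_packet["cause"], default_line)
        self.asserted[line] = True
        return line
```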

Peripheral device 25 is coupled to fabric 26 by a
target network interface unit 40, similar in structure to
unit 28. A target channel adapter (TCA) 42 in unit 40
interfaces via an I/O bus 53 with device 25. Typically,
although not necessarily, bus 53 comprises a PCI bus,
like bus 50. A switch 44 links the TCA to the switching
fabric. A target unit controller 46, similar to
controller 36, acts as FSA to TCA 42 and switch 44 and 
also has a suitable input to receive signals from 
interrupt output 48 of device 25. 

Fig. 2 is a flow chart that schematically 
illustrates a method by which target interface unit 40 
processes and transmits data from peripheral device 25 to 
HCA 32 over fabric 26, in accordance with a preferred
embodiment of the present invention. At a data writing 
step 60, device 25 writes data via bus 53 to TCA 42, to 
be conveyed by direct memory access to memory 22. The 
peripheral device assigns a priority to the data to be 
transmitted and informs the TCA of this priority. At a 
data sending step 62, the TCA packetizes the data and
sends it over fabric 26 to the address of HCA 32, with
the priority assigned by the peripheral device. A packet 
header instructs the HCA to write the data to memory 22. 
Preferably, the TCA negotiates with switch 44 and fabric 
26 to assign a fixed route for all of the packets through
the fabric. Such a route, together with the priority of 
the packets, is referred to herein as a channel. 
InfiniBand specifies that packets travelling over the 
same channel are always kept in their original order. 
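
The notion of a channel as a route combined with a priority can be sketched as follows. The Fabric class, the route labels, and the delivery policy (higher priority drained first) are assumptions made for illustration; only the in-order property within one channel comes from the description above.

```python
from collections import defaultdict, deque


class Fabric:
    """Toy fabric: one FIFO queue per channel, keyed by (route, priority)."""

    def __init__(self):
        self._channels = defaultdict(deque)

    def send(self, route, priority, packet):
        self._channels[(route, priority)].append(packet)

    def drain(self):
        """Deliver everything queued; order is guaranteed only within a channel."""
        delivered = []
        # Assumed policy: channels with a higher priority value are served first.
        for key in sorted(self._channels, key=lambda k: -k[1]):
            queue = self._channels[key]
            while queue:
                delivered.append(queue.popleft())
        return delivered


route = ("switch 44", "switch 30")   # hypothetical fixed route

same_channel = Fabric()
same_channel.send(route, 1, "data0")
same_channel.send(route, 1, "data1")
same_channel.send(route, 1, "interrupt")
in_order = same_channel.drain()      # interrupt stays behind the data

split_channel = Fabric()
split_channel.send(route, 1, "data0")
split_channel.send(route, 2, "interrupt")
overtaken = split_channel.drain()    # interrupt may overtake the data
```

In the first case the interrupt packet cannot arrive before the data. In the second case, with the interrupt on a different channel, no such guarantee holds under the assumed policy.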

When device 25 has finished posting to TCA 42 all of 
the data that it has to send, it asserts interrupt output 
48, at an interrupt assertion step 64. At the same time,
the peripheral device places the cause for the interrupt
(in this case, to instruct CPU 21 to read the data from
memory 22) in an interrupt cause register 49. In systems
known in the art, when the CPU receives the interrupt, it
must communicate with the peripheral device in order to
read this register. In system 20, however, the interrupt
signal is received by controller 46, which instructs TCA
42 to read the interrupt cause from register 49, at a
cause reading step 66.

Based on the interrupt cause information read by the 
TCA, controller 46 constructs an interrupt packet 
containing the interrupt cause information, at an 
interrupt packet sending step 68. The interrupt packet 
is a management packet addressed to the LID of HCA 32. 
It is preferably sent by controller 46 over the same
channel, or virtual lane, as the data packets, after the 
last of the data packets has been sent. The interrupt 
packet also identifies the data with which the interrupt 
is associated. As a result, when the interrupt packet 
arrives at its destination, controller 36 will be able to 
generate an interrupt to CPU 21 that is associated with 
the appropriate memory write, as described below. 
Controller 46 ensures that the interrupt packet is sent to
the fabric after all of the data packets have already 
been accepted for sending. It thus ensures that HCA 32 
will receive the interrupt packet only after it has 
received all of the data packets. 
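
Steps 60 through 68 of Fig. 2 can be summarized in a short sketch of the target side. The class and method names, the dictionary packet layout, and the "data_packets" count used to associate the interrupt with its data are hypothetical; the fabric is represented by a caller-supplied send function.

```python
class TargetInterfaceUnit:
    """Hypothetical model of unit 40 (TCA 42 together with controller 46)."""

    def __init__(self, send, hca_lid):
        self._send = send          # callable(channel, packet); stands in for fabric 26
        self._hca_lid = hca_lid    # LID of HCA 32
        self._channel = None       # (route, priority) of the current transfer
        self._sent = 0             # data packets sent since the last interrupt

    def write_data(self, chunk, route, priority):
        # Steps 60-62: packetize data posted by the device and send it
        # to the HCA with the priority assigned by the device.
        self._channel = (route, priority)
        self._send(self._channel,
                   {"type": "data", "dest": self._hca_lid, "payload": chunk})
        self._sent += 1

    def on_interrupt(self, read_cause_register):
        # Steps 64-68: read the interrupt cause from the device, then send
        # the interrupt packet on the same channel, after the data packets.
        cause = read_cause_register()
        self._send(self._channel,
                   {"type": "interrupt", "dest": self._hca_lid,
                    "cause": cause, "data_packets": self._sent})
        self._sent = 0
```

Because both kinds of packet go through the same send path and the same channel, the interrupt packet is always queued after the data it refers to.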

As an alternative, controller 46 may delay sending
the interrupt packet until TCA 42 receives an 
acknowledgment from memory 22 that it has received all of 
the data. This approach introduces additional delay 
before CPU 21 can receive and act upon the interrupt, but 
it obviates the need to ensure that the interrupt packet 
is routed over the same channel as the data packets.
Such an approach may be called for in particular when
switching fabric 26 comprises a network in which
consistent routing and ordering are not necessarily
maintained among successive packets. This approach can
also be used when the interrupt path and data path are
not the same, and fork at an earlier stage than in Fig.
1. Such path incongruity may occur, for example, when
the device writing data to the memory is different from
the device asserting the interrupt to the CPU. Sometimes
it is also desirable to send interrupts on different 
(high-priority) routes, because data routes can be 
congested, causing interrupt messages to get stuck behind 
data. 

Fig. 3 is a flow chart that schematically 
illustrates a method by which data and accompanying 
interrupt packets are received and processed by host 
interface unit 28 and CPU 21, in accordance with a
preferred embodiment of the present invention. At a 
packet reception step 70, HCA 32 receives the data and 
interrupt packets sent from target interface unit 40. 
The HCA posts the data in the data packets via bus 50 to 
a buffer 58 of system controller 24. The system 
controller proceeds to write the data from its buffer to
the appropriate addresses in memory 22, as is known in 
the art. The HCA passes the interrupt packet to 
controller 36 for decoding, at an interrupt processing
step 72. The controller extracts the cause of the 
interrupt and posts this information, via HCA 32, to an 
interrupt cause register 56 in memory 22. 

Before CPU 21 services the interrupt represented by 
the interrupt packet, it is necessary to ensure that all 
of the associated data have been written to memory 22, at 
a delivery completion step 74. In the case that 
controller 46 of target interface unit 40 is programmed
to send the interrupt packet only after receiving the
acknowledgment from memory 22, as described above, this
problem is already solved. Otherwise, controller 36
preferably waits to assert the interrupt until system
controller 24 has acknowledged to HCA 32 that it has
received all of the data. In response to this
acknowledgment, controller 36 sends an interrupt signal
to interrupt controller 38, at an interrupt assertion 
step 76. The interrupt controller actuates interrupt 
input 27 of CPU 21, to inform the CPU that an interrupt 
has arrived from HCA 32. In response to the interrupt, 
the CPU preferably sends a dummy read command to the HCA, 
in order to ensure that buffer 58 is flushed to memory 22 
before the CPU itself begins to process the data in the 
memory. 
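
The host-side sequence of Fig. 3 (steps 70 through 76, followed by the dummy read) can be sketched as below. The HostSide class, its attribute names, and the packet layout are illustrative assumptions; in particular, the dummy read is modelled simply as a flush of buffer 58 into memory 22.

```python
class HostSide:
    """Hypothetical model of HCA 32, controller 36, buffer 58 and memory 22."""

    def __init__(self):
        self.buffer58 = []               # data posted to system controller 24
        self.memory = []                 # stands in for memory 22
        self.cause_register = None       # interrupt cause register 56
        self.interrupt_asserted = False  # state of interrupt input 27
        self._pending_interrupt = False

    def receive(self, packet):
        # Steps 70-72: post data toward memory; record the interrupt cause.
        if packet["type"] == "data":
            self.buffer58.append(packet["payload"])
        else:
            self.cause_register = packet["cause"]
            self._pending_interrupt = True

    def system_controller_ack(self):
        # Steps 74-76: assert the interrupt only after system controller 24
        # has acknowledged receiving all of the data.
        if self._pending_interrupt:
            self.interrupt_asserted = True
            self._pending_interrupt = False

    def cpu_service_interrupt(self):
        # The CPU's dummy read ensures the buffer is flushed to memory
        # before the CPU reads the cause and processes the data.
        if not self.interrupt_asserted:
            return None
        self.memory.extend(self.buffer58)
        self.buffer58.clear()
        self.interrupt_asserted = False
        return self.cause_register, list(self.memory)
```

Every call in this sketch is local to the host; none of them requires a message to cross the fabric.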

As a further alternative, as long as it is assured 
that the interrupt packet reached HCA 32 after the last 
of the data packets (which will be the case when all of 
the packets are sent over the same channel, as described 
above), controller 36 may send the interrupt signal to 
interrupt controller 38 immediately, without waiting for
an acknowledgment from system controller 24. In this
case, upon receiving the interrupt, CPU 21 preferably
sends a "fence" command to HCA 32. This command 
instructs the HCA to mark the last packet currently in 
its receive queue, and to inform the CPU when this last 
packet has been written to system controller 24. At this 
point, the CPU can send its dummy read command and begin 
processing the data in the memory. 
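
The behavior of the "fence" command may likewise be sketched in software, purely as an illustration. The class ReceiveQueue, its sequence numbers and the flag fence_reached are assumptions introduced for this example and do not appear in the embodiment; the list named sink stands in for system controller 24.

```python
from collections import deque


class ReceiveQueue:
    """Toy model of the receive queue of HCA 32 (all names are hypothetical)."""

    def __init__(self):
        self._queue = deque()
        self._next_seq = 0
        self._fence_seq = None      # sequence number of the marked packet, if any
        self.fence_reached = False  # set when the marked packet has been written

    def enqueue(self, payload):
        self._queue.append((self._next_seq, payload))
        self._next_seq += 1

    def fence(self):
        # Mark the last packet currently in the queue.
        if not self._queue:
            self.fence_reached = True   # nothing is pending, so report at once
            return
        self._fence_seq = self._queue[-1][0]
        self.fence_reached = False

    def write_next(self, sink):
        # Write the oldest packet to the sink and check whether it was the marked one.
        seq, payload = self._queue.popleft()
        sink.append(payload)
        if self._fence_seq is not None and seq >= self._fence_seq:
            self._fence_seq = None
            self.fence_reached = True


sink = []
rq = ReceiveQueue()
rq.enqueue(b"data-0")
rq.enqueue(b"data-1")
rq.fence()               # marks data-1, the last packet queued at this moment
rq.enqueue(b"later")     # arrives after the fence and is not covered by it
rq.write_next(sink)
after_first = rq.fence_reached
rq.write_next(sink)
after_second = rq.fence_reached
```

Packets that arrive after the fence is issued do not delay the notification, since only the packet marked at the time of the command is tracked.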

Once it is assured that all of the relevant data 
have reached their destination in memory 22, CPU 21 reads 
the cause of the current interrupt from register 56, at a 
cause reading step 78. Based on this information, the 
CPU processes the data that peripheral device 25 has 
placed in the memory, at a data processing step 80. 
Unlike methods of interrupt processing known in the art, 
all of the steps in the method of Fig. 3 are carried out 
locally, typically over busses 50 and 52, without the 
need for messages to traverse fabric 26. 
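
As a further illustration, steps 78 and 80 may be pictured as a local read of the cause followed by a dispatch to a matching handler. The sketch below is hypothetical throughout: the dictionary standing in for memory 22, the key CAUSE_REGISTER, the cause code "rx_complete" and the handler table are all assumptions made for the example.

```python
CAUSE_REGISTER = "cause_register_56"   # hypothetical key standing in for register 56


def service_interrupt(memory, handlers):
    """Read the interrupt cause locally, then process the data already in memory."""
    cause = memory[CAUSE_REGISTER]     # cause reading step 78
    handler = handlers[cause]          # select processing according to the cause
    return handler(memory)             # data processing step 80


def handle_rx_complete(memory):
    # Hypothetical handler: report how many bytes the peripheral device delivered.
    return len(memory["dma_buffer"])


memory = {CAUSE_REGISTER: "rx_complete", "dma_buffer": b"payload"}
handlers = {"rx_complete": handle_rx_complete}
result = service_interrupt(memory, handlers)
```

Both the cause and the data are read from the local memory in this sketch, which mirrors the absence of any further exchange across the fabric.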

It will be appreciated that the preferred 
embodiments described above are cited by way of example, 
and that the present invention is not limited to what has 
been particularly shown and described hereinabove. 
Rather, the scope of the present invention includes both 
combinations and subcombinations of the various features 
described hereinabove, as well as variations and 
modifications thereof which would occur to persons 
skilled in the art upon reading the foregoing description 
and which are not disclosed in the prior art.

CLAIMS

1. A method for conveying data over a packet-switching 
network, comprising: 

receiving data from a peripheral device for 
transmission via the network to a memory associated with 
a central processing unit (CPU);

receiving an interrupt signal from the peripheral 
device associated with the data; 

sending one or more data packets containing the data
over the network to a host network interface serving the 
memory and the CPU; and 

sending an interrupt packet over the network to the 
host network interface, responsive to which an interrupt 
input of the CPU is asserted only after the one or more
data packets have arrived at the host network interface. 

2. A method according to claim 1, wherein receiving the 
data comprises receiving parallel data over a local bus 
from the peripheral device. 

3. A method according to claim 1, wherein receiving the 
data comprises receiving data to be written to the memory 
by direct memory access. 

4. A method according to claim 1, wherein sending the 
interrupt packet comprises reading a cause of the 
interrupt from the peripheral device, and incorporating 
the cause in the interrupt packet. 

5. A method according to claim 4, and comprising 
receiving the interrupt packet at the host network 
interface, and writing the cause to a predetermined 
address in the memory, to be read by the CPU after the 
interrupt input is asserted. 

6. A method according to claim 1, wherein sending the 
interrupt packet comprises sending the interrupt packet 
after receiving an acknowledgment from the memory that
the data have been written thereto. 

7. A method according to claim 1, wherein sending the 
one or more data packets comprises sending the data 
packets over a selected channel through the network, and 
wherein sending the interrupt packet comprises sending 
the interrupt packet over the selected channel following 
the data packets.

8. A method according to claim 1, and comprising:

receiving the data packets and the interrupt packet
at the host network interface;

conveying the data in the packets for delivery to 
the memory over a local bus coupling the host network 
interface to the memory and the CPU; and 

notifying the CPU when all of the data have been 
conveyed. 

9. A method according to claim 8, wherein conveying the 
data in the packets comprises passing the data to a 
system controller on the bus, and wherein notifying the 
CPU comprises informing the CPU when an acknowledgment is 
received by the host network interface from the system 
controller. 

10. A method according to claim 9, wherein informing the 
CPU comprises asserting the interrupt input of the CPU 
after the acknowledgment from the system controller has 
been received. 

11. A method according to claim 8, wherein notifying the 
CPU comprises asserting the interrupt input of the CPU 
responsive to receiving the interrupt packet at the host 
network interface.

12. Network interface apparatus, comprising:

a target channel adapter, which is operative to
receive data from a peripheral device for transmission 
via a packet-switching network to a memory associated 
with a central processing unit (CPU) and to send one or 
more data packets containing the data over the network to 
a host network interface serving the memory and the CPU; 
and 

a target interface processor, adapted to receive an 
interrupt signal from the peripheral device associated 
with the data, and to send an interrupt packet over the 
network to the host network interface, responsive to 
which an interrupt input of the CPU is asserted only 
after the one or more data packets have arrived at the 
host network interface.

13. Apparatus according to claim 12, wherein the target 
channel adapter comprises an interface to a local 
parallel bus linked to the peripheral device, over which 
the device sends the data. 

14. Apparatus according to claim 12, wherein the target 
channel adapter is operative to read a cause of the 
interrupt from the peripheral device, and wherein the 
processor is adapted to incorporate the cause in the 
interrupt packet. 

15. Apparatus according to claim 14, and comprising a 
host channel adapter, coupled to receive the interrupt 
packet at the host network interface, and to write the 
cause to a predetermined address in the memory, to be 
read by the CPU after the interrupt input is asserted. 

16. Apparatus according to claim 12, wherein the 
processor is adapted to send the interrupt packet after 
receiving an acknowledgment from the memory that the data 
have been written thereto.

17. Apparatus according to claim 12, wherein the target
channel adapter is coupled to send the data packets over 
a selected channel through the network, and wherein the 
processor is adapted to send the interrupt packet over 
the selected channel following the data packets. 

18. Apparatus according to claim 17, and comprising a 
switch coupling the target channel adapter and the 
processor to the network, wherein the switch comprises a 
receive queue into which the target channel adapter 
places the data packets, and wherein the processor is 
adapted to place the interrupt packet into the receive
queue following the data packets.

19. Apparatus according to claim 12, and comprising a
host interface unit, which is coupled to receive the data
and interrupt packets transmitted over the network, and
is operative to convey the data in the packets for
delivery to the memory over a local bus coupled to the
memory and the CPU and to notify the CPU when all of the
data have been conveyed.

20. Apparatus according to claim 19, wherein the host
interface unit is coupled to assert the interrupt to the
CPU responsive to the interrupt packet.

21. Apparatus according to claim 12, wherein the target 
channel adapter comprises an InfiniBand adapter. 

22. Network interface apparatus, comprising: 

a host channel adapter, which is operative to 
receive data packets transmitted over a packet-switching 
network from a peripheral device, and to convey data from 
the packets for delivery to a memory associated with a 
CPU over a local bus that is coupled to the memory and 
the CPU, and further to receive an interrupt packet sent 
over the network responsive to an interrupt signal 
asserted by the peripheral device after sending the data
to the network; and 

a host interface processor, adapted, responsive to 
the interrupt packet, to notify the CPU when all of the 
data have been conveyed to the local bus. 

23. Apparatus according to claim 22, wherein the host 
channel adapter is operative to convey the data to the 
memory by direct memory access. 

24. Apparatus according to claim 22, wherein the host 
channel adapter is operative to convey the data to a 
system controller on the bus, and wherein the CPU is
notified when an acknowledgment is received by the host
channel adapter from the system controller.

25. Apparatus according to claim 24, wherein the host
interface processor is coupled to assert the interrupt
input of the CPU after the acknowledgment from the system
controller has been received.

26. Apparatus according to claim 22, wherein the host
interface processor is coupled to assert the interrupt
input of the CPU responsive to receipt of the interrupt
packet at the host network interface.

27. Apparatus according to claim 22, wherein the host 
channel adapter comprises an InfiniBand adapter.

ABSTRACT

A method and apparatus for conveying data over a 
packet-switching network. Data are received from a 
peripheral device for transmission via the network to a 
memory associated with a central processing unit (CPU),
followed by an interrupt signal from the peripheral 
device associated with the data. One or more data 
packets containing the data are sent over the network to 
a host network interface serving the memory and the CPU, 
followed by an interrupt packet sent over the network to 
the host network interface. Responsive to the interrupt 
packet, an interrupt input of the CPU is asserted only
after the one or more data packets have arrived at the
host network interface.